DETAILED ACTION

1. Claims 1-23 are pending.

2. Claims 21-23 have been added.

3. The 35 USC 101 rejection pertaining to claim 10 has been withdrawn. 

4. The newly submitted abstract has been considered and is accepted.



Response to Arguments 

5. Applicant's arguments filed 9/24/2008 have been fully considered but they are
not persuasive.

6. Applicant argues "Page 4 of the Office Action alleges that Kanevsky discloses 
generating a probability model in which information indicating each word of a text 
document is made to correspond to a latent variable, as required by claims 1, 10, 11,
and 12." (Remarks, Page 12, ¶ 5) The examiner respectfully disagrees. In response to
applicant's argument that the references fail to show certain features of applicant's 
invention, it is noted that the features upon which applicant relies (i.e., training data) are 
not recited in the rejected claim(s). Although the claims are interpreted in light of the 
specification, limitations from the specification are not read into the claims. See In re 
Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). Furthermore, it is noted
that the system of the instant application must have some sort of training data for at
least the initial generation of the models. Without at least the preliminary comparison to
known topics or words or segments, etc., there is no way to generate the models
correctly. Evidence that at least some training data is needed can be found in the
specification, Page 15, ¶ 2, where the vocabulary size must be known for the model
initializing unit.

7. Applicant argues "Kanevsky does not generate a probability model that meets 
the limitations of claims 1, 10, 11, and 12, ... Claims 1, 10, 11, and 12 require that the 
initial value of a model parameter defines the generated probability model. Thus, 
without a comparable probability model, this limitation also clearly (is) not taught by 
Kanevsky" (Remarks, Page 13, ¶ 4) The examiner respectfully disagrees. Kanevsky
uses a trained battery of topics to define the models; therefore, an initial value defines the
generated probability model.

8. Applicant argues "Contrary to the Office Action's assertions on Page 5, the 
neutral topic does not establish a generality that can be further defined." (Remarks, 
Page 14, ¶ 1) The examiner agrees that the neutral topic does not; however, the
candidate topic Tj does. (Kanevsky, column 4, lines 52-67) As is shown, an initial topic
is chosen, and compared against the likelihoods of competing topics to determine the 
most appropriate topic. Therefore, the topic is initially determined and further defined. 

9. Applicant argues "Kanevsky does not employ a model in text segmentation. 
Combining Rabiner with Kanevsky for the purpose of having multiple models will not 
teach the above limitation of claim 2. Kanevsky and Rabiner combined will still lack the 
step of selecting a model on the basis of estimated model parameters." (Remarks, Page 
15, ¶ 3) The examiner respectfully disagrees. The likelihood measures as described in
the rejection of claim 1 are defined by models in Kanevsky (further shown in Kanevsky,
column 8, lines 1-7). Therefore, Kanevsky employs models in the text segmentation and
the combination of Rabiner with Kanevsky teaches the instant application for the
selection of the best model.

Information Disclosure Statement 

10. The Information Disclosure Statement (IDS) submitted on 9/24/2008 is in 
compliance with the provisions of 37 CFR 1.97.

Specification 

11. The title of the invention is not descriptive. A new title is required that is clearly
indicative of the invention to which the claims are directed. 

Drawings 

12. The drawings filed on 7/14/2006 are objected to by the examiner. Fig. 5 should 
be labeled as prior art. 

Claim Objections 

13. Claim 22 is objected to because of the following informalities: There is a period
inappropriately placed in the claim language. Appropriate correction is required. 



Claim Rejections - 35 USC § 112

The following is a quotation of the first paragraph of 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of
making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the
art to which it pertains, or with which it is most nearly connected, to make and use the same and shall
set forth the best mode contemplated by the inventor of carrying out his invention.

14. Claims 1-23 are rejected under 35 U.S.C. 112, first paragraph, as failing to 
comply with the enablement requirement. The claim(s) contains subject matter which 
was not described in the specification in such a way as to enable one skilled in the art to 
which it pertains, or with which it is most nearly connected, to make and/or use the 
invention. The applicant argues that no training data is used in the generation of the 
models. It is not understood by the examiner how the models will be sufficiently correct 
without training data in at least one of the generating unit, initializing unit, and the 
estimating unit. For the purposes of examination, all the claims will be interpreted as
having some sort of training data to develop a correct hypothesis. Further clarification is
needed. 

The following is a quotation of the second paragraph of 35 U.S.C. 112: 

The specification shall conclude with one or more claims particularly pointing out and distinctly 
claiming the subject matter which the applicant regards as his invention.

15. Regarding claims 21-23, the phrase "on the one hand ... on the other hand"
renders the claim indefinite because it is unclear whether the limitations following the 
phrase are part of the claimed invention. See MPEP § 2173.05(d). 



Claim Rejections - 35 USC § 102 

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public
use or on sale in this country, more than one year prior to the date of application for patent in the United
States.

16. Claims 1, 3, 5, 6, 10-12, 14, 16, 17, 21, and 23 are rejected under 35
U.S.C. 102(b) as being anticipated by Kanevsky et al. (US Patent #6104989).

As per claim 1, Kanevsky discloses:

generating a probability model in which information indicating which word of a 
text document belongs to which topic is made to correspond to a latent variable and 
each word of the text document is made to correspond to an observable variable 
(Kanevsky, column 2, lines 25-28, ...The present invention implements a content-based
approach that exploits the analogy to speech recognition, allowing segmentation to be
treated as a Hidden Markov Model (HMM) process..., Furthermore, Kanevsky, column
2, lines 45-53, further ...The metric includes some likelihood measure that a word string
extending from the current word to the prior word will be found in a context of a topic in
the battery... Each word in the document is made to correspond to some kind of
relationship to corresponding words which teaches the observable variable. The latent
variable is that the word itself is not identifiable to a topic without its surrounding words.)

outputting an initial value of a model parameter which defines the generated 
probability model (Kanevsky, column 5, lines 1-2, ...If a conclusion is reached
that a current topic is not in the list, declare T as the current topic... T has been
established as a neutral topic wherein it describes a general distribution and initial value
of a model parameter for when a topic cannot be found in the battery. Furthermore, 
Kanevsky, column 4, lines 13-19, describes that it defines the generated probability 
model because it establishes a generality that can be further defined while also 
retaining its location with respect to its detected applicable words.) 

estimating a model parameter corresponding to a text document as a processing 
target on the basis of the output initial value of the model parameter and the text 
document (Kanevsky, column 4-5, lines 57-67, 1-5, discloses that a
maximum likelihood is calculated for the topic that is applicable to the text, ...If a
conclusion is reached that a topic is not in the list, declare T the current topic... T is
used to preserve space for further processing. The topics are estimated corresponding 
to a text document based on the battery, but also on if the neutral topic is used.) 

segmenting the text document as the processing target for each topic on the 
basis of the estimated model parameter (Kanevsky, column 8-9, lines 65-67, 1-8,
...A block 400 contains a text that should be translated. This text is segmented (404)
with topic onsets and labeled with topics in a block 401 using likelihood ratios 403 as in
explanations in FIG. 4. While text data is accumulated to proceed with topic identification
of a segment it is stored in the buffer 402..., The segmentation occurs as a result of the
likelihood ratios and the topic labeling.)
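
For illustration only, the correspondence recited in claim 1 (the topic of each word as a latent variable, the word itself as an observable variable, and segmentation from the estimated parameters) can be sketched as a small discrete-output HMM decoded with the Viterbi algorithm. This is a minimal sketch and is not taken from the application or from Kanevsky: the two topics, the four-word vocabulary, every probability value, and the function names `viterbi` and `segment` are hypothetical.

```python
import math

# Hypothetical two-topic model; none of these values come from the record.
TOPICS = ["sports", "finance"]
START = {"sports": 0.5, "finance": 0.5}
# Probability of the topic transitioning between consecutive words.
TRANS = {
    "sports": {"sports": 0.9, "finance": 0.1},
    "finance": {"sports": 0.1, "finance": 0.9},
}
# Probability of each word being output by each topic (discrete output).
EMIT = {
    "sports": {"goal": 0.4, "team": 0.4, "stock": 0.1, "bank": 0.1},
    "finance": {"goal": 0.1, "team": 0.1, "stock": 0.4, "bank": 0.4},
}


def viterbi(words):
    """Return the most likely topic (latent variable) for each word."""
    # score[t] is the best log-probability of any path ending in topic t.
    score = {t: math.log(START[t]) + math.log(EMIT[t][words[0]]) for t in TOPICS}
    back = []
    for w in words[1:]:
        prev, score, pointers = score, {}, {}
        for t in TOPICS:
            best = max(TOPICS, key=lambda s: prev[s] + math.log(TRANS[s][t]))
            score[t] = prev[best] + math.log(TRANS[best][t]) + math.log(EMIT[t][w])
            pointers[t] = best
        back.append(pointers)
    # Trace back from the best final topic.
    path = [max(TOPICS, key=lambda t: score[t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def segment(words):
    """Group consecutive words that share a decoded topic."""
    segments = []
    for word, topic in zip(words, viterbi(words)):
        if segments and segments[-1][0] == topic:
            segments[-1][1].append(word)
        else:
            segments.append((topic, [word]))
    return segments
```

Under these assumed parameters, `segment(["goal", "team", "stock", "bank"])` places the topic boundary between "team" and "stock". The sketch assumes every input word is in the vocabulary and that no probability is zero.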

As per claim 3, claim 1 is incorporated and Kanevsky discloses: 

a probability model is a hidden Markov model. (Kanevsky, column 2, lines 
25-28, ...The present invention implements a content-based approach that exploits the
analogy to speech recognition, allowing segmentation to be treated as a Hidden Markov
Model (HMM) process...)

As per claim 5, claim 3 is incorporated and Kanevsky discloses: 

the hidden Markov model is of a discrete output type. (Kanevsky, column
8-9, lines 65-67, 1-8, ...A block 400 contains a text that should be translated. This text
is segmented (404) with topic onsets and labeled with topics in a block 401 using
likelihood ratios 403 as in explanations in FIG. 1. While text data is accumulated to
proceed with topic identification of a segment it is stored in the buffer 402. After a topic 
of the current segment was established a text segment from a buffer is sent to 405 for 
translation. A machine 405 performs translation on each homogenous segment using 
different language models that were trained for each topic. An output of the machine 
405 is a translated text 406... The HMM model responsible for segmenting the text 
prepares the text for translation which would inherently be output in a segmented 
discrete output type to a sequence of words from the initial string.) 

As per claim 6, claim 1 is incorporated and Kanevsky discloses: 

the step of estimating a model parameter comprises the step of estimating a 
model parameter by using one of maximum likelihood estimation and maximum a 
posteriori estimation (Kanevsky, column 4-5, lines 57-67, 1-5, discloses
that a maximum likelihood is calculated for the topic that is applicable to the text, ...If a
conclusion is reached that a topic is not in the list, declare T the current topic..., T is
used to preserve space for further processing. The topics are estimated corresponding
to a text document based on the battery, but also on if the neutral topic is used.) 
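
For illustration only, the two alternatives recited in claim 6 can be contrasted on the word-output probabilities of a single topic. This is a minimal sketch under stated assumptions, not the method of the application or of Kanevsky: the function names are hypothetical, the prior is assumed to be a symmetric Dirichlet with `alpha` greater than 1, and every word is assumed to appear in `vocab`.

```python
from collections import Counter


def ml_estimate(words, vocab):
    """Maximum likelihood estimate: relative frequency of each word."""
    counts = Counter(words)
    total = len(words)
    return {w: counts[w] / total for w in vocab}


def map_estimate(words, vocab, alpha=2.0):
    """Maximum a posteriori estimate (posterior mode) under an assumed
    symmetric Dirichlet(alpha) prior with alpha > 1."""
    counts = Counter(words)
    denom = len(words) + len(vocab) * (alpha - 1.0)
    return {w: (counts[w] + alpha - 1.0) / denom for w in vocab}
```

With the observations `["goal", "goal", "team"]` and the vocabulary `["goal", "team", "bank"]`, the maximum likelihood estimate assigns the unseen word "bank" a probability of zero, whereas the maximum a posteriori estimate with `alpha=2.0` assigns it 1/6.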

Claims 12, 14, 16, and 17 are rejected under the same principles for being the
apparatus claims to the corresponding method claims 1, 3, 5, and 6. Each of the stated
corresponding claims has parallel limitations between the method and the device and
the hardware aspect of claims 12, 14, 16, and 17 is taught by (Kanevsky, claims 13-24)
which define the apparatus directed to the previous method claims for practicing the
invention.

As per claim 21, Kanevsky teaches:

estimating a parameter of a probability model so that the probability of the text 
document being output is maximized or locally maximized, said probability model being 
on the one hand determined for each latent variable representing which word of the text 
document belongs to what number of topics and on the other hand defined by a 
probability of the word being output and a probability of the topic transitioning; and 
(Kanevsky, column 4, lines 52-67, ...Find a candidate topic T, for which the likelihood of
the text is maximal..., the likelihoods are for the text (word) which belongs to the topic
(latent variable). Furthermore, the designation of a topic defines the probability of a
word being output as it is linked to the maximal probability for the topic. Lastly, column
5, lines 3-65, ...some likelihood measure for "seeing" a given string of words... teaches
the transition.)

segmenting the text document for each topic by estimating the value of the latent 
variable for each word on the basis of the parameter of the probability model estimated 
above. (Fig. 4, segmentation) 

Claim 23 is rejected under the same principles for being the apparatus claim to
the corresponding method claim 21. Each of the stated corresponding claims has
parallel limitations between the method and the device and the hardware aspect of
claim 23 is taught by (Kanevsky, claims 13-24) which define the apparatus directed to
the previous method claims for practicing the invention.

Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

17. Claims 4, 15, and 22 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Kanevsky et al. (US Patent #6104989). 

As per claim 4, claim 3 is incorporated and Kanevsky fails to fully teach: 
the hidden Markov model has a unidirectional structure. 
(Kanevsky teaches a real-time system where it would be obvious to someone of
ordinary skill in the art that a left-to-right hidden Markov model would be used because 
time only flows from left-to-right and not vice versa.) 
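
For illustration only, a unidirectional (left-to-right) structure can be expressed as a transition matrix in which no state leads back to an earlier state. This is a minimal sketch under stated assumptions: the function names and the `stay` probability are hypothetical and appear in neither the application nor Kanevsky.

```python
def left_to_right_transitions(n_states, stay=0.8):
    """Build a transition matrix that only allows staying or moving forward."""
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i == n_states - 1:
            matrix[i][i] = 1.0  # the final state is absorbing
        else:
            matrix[i][i] = stay
            matrix[i][i + 1] = 1.0 - stay
    return matrix


def is_unidirectional(matrix):
    """True when no transition leads back to an earlier state."""
    return all(matrix[i][j] == 0.0 for i in range(len(matrix)) for j in range(i))
```

A fully connected (ergodic) matrix such as `[[0.5, 0.5], [0.5, 0.5]]` fails the `is_unidirectional` check, while any matrix produced by `left_to_right_transitions` passes it.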

Claim 15 is rejected under the same principles for being the apparatus claim to
the corresponding method claim 4. Each of the stated corresponding claims has
parallel limitations between the method and the device and the hardware aspect of
claim 15 is taught by (Kanevsky, claims 13-24) which define the apparatus directed to
the previous method claims for practicing the invention.

Claims 10, 11, and 22 are rejected under the same principles for being the
apparatus claims to the corresponding method claims 1 and 21. Each of the stated
corresponding claims has parallel limitations between the method and the recording
medium of claims 10, 11, and 22. (Kanevsky, column 8, lines 54-64) teaches a real-time
application of the method. It would have been obvious to someone of ordinary skill
in the art at the time of the invention that a computer-based system would provide
real-time functionality and a computer-based system needs to be programmed by a
computer readable recording medium in order to be functional according to the method.

18. Claims 2, 9, 13, and 20 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Kanevsky et al. (US Patent #6104989) in view of NPL document "A
Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition",
hereinafter Rabiner.

As per claim 2, claim 1 is incorporated and Kanevsky teaches: 

the generation of a probability model; the step of outputting an initial value of a
model parameter for the probability model; and estimating a model parameter for the
probability model. (Kanevsky shows these limitations in the rejection of claim 1
above.)

Kanevsky fails to teach, 

multiple probability models 
Rabiner, in analogous art, teaches the above limitation, 

(Rabiner, page 10, discloses multiple HMM models which are applicable to the 
HMM used in Kanevsky. It would be obvious to someone of ordinary skill in the art that 
multiple models could be developed around the number of states or form of the HMM 
models to characterize them differently. Upon use, one will perform the best, so it would 
be obvious that one model would be chosen to be used for segmentation of the text 
document.) 

Rabiner and Kanevsky are analogous art because both pertain to modeling of 
speech. It would be obvious to someone of ordinary skill in the art at the time of the 
invention to combine Rabiner with the Kanevsky device because "In this paper we 
attempt to carefully and methodically review the theoretical aspects of this type of
statistical modeling and show how they have been applied to selected problems in
machine recognition of speech." (Rabiner, Page 1) Rabiner discloses the statistical 
approach to how the hidden Markov model is used in Kanevsky. 

As per claim 9, claim 2 is incorporated and Kanevsky fails to teach: 

the step of selecting a probability model comprises the step of selecting a 
probability model by using one of an Akaike's information criterion, a minimum 
description length criterion, and a Bayes posteriori probability 
(However, Akaike's information criterion is well known in the art for model selection. 
Since it is obvious to select a model, it would be obvious to someone of ordinary skill to 
use Akaike's information criterion to select a model to determine the best model for 
segmentation of the text.) 
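
For illustration only, model selection by Akaike's information criterion compares candidates by AIC = 2k - 2 ln(L), where k is the number of free parameters and L is the maximized likelihood, and keeps the candidate with the smallest value. This is a minimal sketch: the function names and the candidate figures mentioned below are hypothetical and are not drawn from the application, Kanevsky, or Rabiner.

```python
def aic(log_likelihood, n_params):
    """Akaike's information criterion: AIC = 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_model(candidates):
    """Return the name of the candidate with the smallest AIC.

    `candidates` maps a model name to a tuple of
    (maximized log-likelihood, number of free parameters).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))
```

For example, given hypothetical candidates `{"2 topics": (-1250.0, 40), "3 topics": (-1200.0, 63), "4 topics": (-1195.0, 88)}`, the AIC values are 2580, 2526, and 2566, so `select_model` returns `"3 topics"`: the small gain in likelihood from a fourth topic does not offset its additional parameters.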

Claims 13 and 20 are rejected under the same principles for being the apparatus
claims to the corresponding method claims 2 and 9. Each of the stated corresponding
claims has parallel limitations between the method and the device and the hardware
aspect of claims 13 and 20 is taught by (Kanevsky, claims 13-24) which define the
apparatus directed to the previous method claims for practicing the invention.

19. Claims 7-8 and 18-19 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Kanevsky et al. (US Patent #6104989) in view of NPL document "Bayesian
Adaptive Learning of the Parameters of Hidden Markov Model for Speech Recognition",
hereinafter Huo.

As per claim 7, claim 1 is incorporated and Kanevsky fails to teach, but Huo teaches:

the step of outputting an initial value of a model parameter comprises the step of 
hypothesizing a distribution using the model parameter as a probability variable, and 
outputting an initial value of a hyper-parameter defining the distribution 
(Huo, page 335, ...we do not explicitly show the parameters of the prior PDF (often
referred to as the hyperparameters) which are assigned values by the investigator since
the values are assigned, and thus initialized...) 

the step of estimating a model parameter comprises the step of estimating a 
hyper-parameter corresponding to a text document as a processing target on the basis 
of the output initial value of the hyper-parameter and the text document 
(Huo, page 335, ...the important issue of prior density estimation is addressed and an 
empirical Bayes method to estimate the hyperparameters of prior density based on the 
moment estimate is proposed..., Furthermore, Huo, page 339, teaches that equation 49 is
used for updating the hyperparameters. Thus, they are based on the initial value and 
are estimated.) 

Huo and Kanevsky are analogous art because Huo's paper concerns the training 
of the HMM model parameters that are used in Kanevsky. It would be obvious to 
someone of ordinary skill in the art at the time of the invention to combine Huo with the 
Kanevsky device because Huo provides algorithms that "are shown to be effective 
especially in the cases in which the training or adaptation data are limited" which would 
provide an improvement over previous algorithms.

As per claim 8, claim 7 is incorporated and Kanevsky fails to teach: 

the step of estimating a hyper-parameter comprises the step of estimating a
hyper-parameter by using Bayes estimation
(Huo, page 335, ...the important issue of prior density estimation is addressed
and an empirical Bayes method to estimate the hyperparameters of prior density based
on the moment estimate is proposed..., Bayes estimation is used to estimate the
parameters.)

Huo and Kanevsky are analogous art because Huo's paper concerns the training 
of the HMM model parameters that are used in Kanevsky. It would be obvious to 
someone of ordinary skill in the art at the time of the invention to combine Huo with the 
Kanevsky device because Huo provides algorithms that "are shown to be effective 
especially in the cases in which the training or adaptation data are limited" which would
provide an improvement over previous algorithms.

Claims 18 and 19 are rejected under the same principles for being the apparatus
claims to the corresponding method claims 7 and 8. Each of the stated corresponding
claims has parallel limitations between the method and the device and the hardware
aspect of claims 18 and 19 is taught by (Kanevsky, claims 13-24) which define the
apparatus directed to the previous method claims for practicing the invention.

Conclusion

20. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. Refer to PTO-892, Notice of References Cited for a listing of 
analogous art. 

21. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to GREG A. BORSETTI whose telephone number is 
(571)270-3885. The examiner can normally be reached on Monday - Thursday (8am - 
5pm Eastern Time). 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, RICHEMOND DORVIL can be reached on 571-272-7602. The fax phone 
number for the organization where this application or proceeding is assigned is 571- 
273-8300. 

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for
published applications may be obtained from either Private PAIR or Public PAIR.
Status information for unpublished applications is available through Private PAIR only.
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should
you have questions on access to the Private PAIR system, contact the Electronic
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a
USPTO Customer Service Representative or access to the automated information
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Greg A. Borsetti/ 
Examiner, Art Unit 2626

/Talivaldis Ivars Smits/
Primary Examiner, Art Unit 2626 



